Production in an economy is a set of firms' activities as suppliers and customers; a firm buys goods
from other firms, adds value, and sells products to others in a giant network of production.
Empirical study is lacking, even though the structure of the production network is important
for understanding and modeling many aspects of economic dynamics. We study a nation-wide
production network comprising a million firms and millions of supplier-customer links by using
recent statistical methods developed in physics. The empirical analysis shows a scale-free degree
distribution, disassortativity, correlation of degree with firm-size, and a community structure with
sectoral and regional modules. Since suppliers usually provide credit to their customers, who supply
it to theirs in turn, each link is actually a creditor-debtor relationship. We also study chains
of failures or bankruptcies that take place along those links in the network, and the corresponding
avalanche-size distribution.



I. INTRODUCTION 

The physics community has recently witnessed considerable
development of statistical methods for quantifying large
networks in fields including biology, information, technology,
economics and society [1-3]. The development enables one to
quantify the statistical features and the modular and
heterogeneous structures of large networks that are not
amenable even to visualization.

A production network in economics refers to a line of
economic activities in which firms buy intermediate goods
from "upstream" firms, add value to them, and sell the
goods to "downstream" firms. The net sum of value added in
the whole network is basically the net total production in
a nation, that is, gross domestic product (GDP).

Consider a ship manufacturer, for example. The manufacturer
buys a number of intermediate goods, including steel
materials, mechanical and electronic devices, etc., and
produces ships. The firm adds value to the products in each
process of production. On the upstream side of the ship
manufacturer, there could be a processed-steel manufacturer,
which in turn buys intermediate goods such as raw steel and
fabricating machines. The steel manufacturer may supply its
products to car manufacturers as well. The same holds on the
downstream side, which includes retail and wholesale firms.

The entire line of these processes of adding value in turn,
therefore, forms a giant web of production ranging from
upstream to downstream, ultimately down to consumers. Each
firm needs labor and financing in addition to intermediate
goods, and utilizes these inputs to produce outputs in
anticipation of profits. Thus, a real economy has its
driving force in production, and is fueled by labor and
financing.

There are studies based on models of production networks,
notably in overlapping communities of economists and
physicists. They include inventory dynamics [4-6] (see also
[7]), suppliers/customers dynamics [8], and a credit-chain
model [9]. These works are, unfortunately, not based on any
empirical study of the structure of production networks. It
is highly desirable to investigate the structure on a
nation-wide scale in order to develop these insightful
models, but such a study has so far been considered a
formidable task.

The present paper performs precisely such an empirical
study of the large-scale structure of a production network
that comprises most firms in a nation as nodes, and
supplier-customer relations as links. We will find that the
network is neither regular nor random, but possesses a
scale-free degree distribution, disassortativity, and other
statistical properties, as well as structural modules that
depend on industrial sectors and geographical regions and
vary widely in size.

The structural heterogeneity would have many important
consequences for dynamics on a production network. For
example, demand by firms and consumers downstream will
propagate upstream; when foreign countries increase demand
for ships, the output of ship manufacturers will grow, which
possibly stimulates production of processed steel, raw
metals, related machines, etc. in upstream firms. This
propagation will not take place homogeneously but
heterogeneously.

Conversely, decreasing demand also causes a chain reaction.
An individual firm's profit is equal to sales minus costs.
The firm may use factors of production in anticipation of
profit, but always faces uncertainty in ex-post demand,
labor and financial costs, price changes of intermediate
goods, and so forth. Profits are determined only a
posteriori, through the interaction of firms in the
production network. Once a firm goes into a state of
financial insolvency or bankruptcy, the balance sheets of
its upstream firms deteriorate through the loss of a
fraction of their profits, and those firms may eventually go
bankrupt as well. We will show that, due to the network
structure, such a chain of failures is by no means
negligible, and that it therefore has a considerable effect
on macroeconomic activity.

In Section II, we describe how nodes and links are surveyed
and recorded in our dataset of a production network, in
addition to another dataset, an exhaustive list of
bankruptcies that occurred in the network over a certain
period of time. In Section III, we study the structure of
the production network by employing the statistical methods
in [1-3]. In Section IV, we extract community structure
in the network. In Section V, we examine the chains of
bankruptcies with a focus on the avalanche-size
distribution. After a brief discussion in Section VI, we
conclude in Section VII.



II. SUPPLIER-CUSTOMER LINKS AND 
BANKRUPTCY DATA 

Let us say that a directional link is present as A → B
in a production network, where firm A is a supplier to
another firm B, or equivalently, B is a customer of A.
While it is difficult to record every transaction of supply
and purchase among firms, it is also pointless to have a
record that a firm buys a pencil from another. What our
study needs are data on links such that the relation
A → B is crucial for the activity of one or both of A and
B. If at least one of the firms at either end of a link
nominates the other firm as one of its most important
suppliers or customers, then the link should be listed.

Our dataset for supplier-customer links is based on this
idea. Tokyo Shoko Research, Inc., one of the leading
credit research agencies in Japan, regularly gathers credit
information on most active firms through investigation
of financial statements and corporate documents, and by
interview-based surveys at branch offices located across the
nation. Financial and credit information on individual
firms is compiled in commercially available databases.
The credit information on an individual firm includes its
suppliers and customers, up to 24 companies for each, that
are considered to be most crucial for its business
activities. We assume that the links playing important roles
in the production network are recorded at either end of
each link as described above, while we should keep in mind
that relatively unimportant links may be dropped from the
data. Although the amounts of transactions would provide
information on the weights of links, that is, on the relative
importance of suppliers and customers, such data are only
partially available at the moment. We simply ignore the
weights in this paper. We also remark that the financial
sector is under-represented in the database, as the links of
financial companies are not included.

We have a snapshot of the production network compiled
in September 2006. In the data, the number of firms is
roughly a million, and the number of directional links is
more than four million (see Section III). The set of nodes
in the network covers most of the domestic firms that are
active in the sense that their credit information is
required. Attached to each firm are financial information on
firm-size (measured as sales, profit, number of employees
and their growth), major and minor classification into
industrial sectors, details of products, the firm's banks,
principal shareholders, and miscellaneous information
including geographical location. In particular, the
industrial sectors are classified into more than 1,200
industries and are categorized hierarchically



TABLE I: Classification of industrial sectors. The third column
shows the numbers of major groups/groups/industries classified
in each division. The fourth column gives the percentage of
firms in each division according to the primary industry of each
firm in the dataset of September 2006.

     division (a)                     #class. (a)     #firms (%)
A    agriculture                      1/4/20            0.53
B    forestry                         1/5/9             0.05
C    fisheries                        2/4/17            0.11
D    mining                           1/6/30            0.18
E    construction                     3/20/49          29.92
F    manufacturing                    24/150/563       17.69
G    electricity/gas/heat/water       4/6/12            0.06
H    information/communications       5/15/29           2.42
I    transport                        7/24/46           3.54
J    wholesale/retail trade           12/44/150        29.07
K    finance/insurance                7/19/68           0.65
L    real estate                      2/6/10            2.61
M    food establishments              3/12/18           1.46
N    medical/health care/welfare      3/15/37           1.03
O    education/learning support       2/12/33           0.36
P    compound services (b)            2/4/8             0.64
Q    services (c)                     15/68/164         9.45
R    government (c)                   2/5/5             0.18
S    unable to classify               1/1/1             0.03
     Total                            97/420/1,269     99.98

(a) Japan Standard Industrial Classification, Rev. 11, March 2002:
    http://www.stat.go.jp/english/index/seido/sangyo/index.htm
(b) Government-affiliated postal services, and agriculture, forestry,
    fisheries and business cooperative associations.
(c) Not elsewhere classified.



into 19 divisions, 97 major groups and more than 400 minor
groups (see Table I). For example, the manufacturing
sector (F) is classified into 24 major groups as tabulated
in Table II. Each firm has an industry classification
according to the sector it belongs to as its primary (and
also secondary and tertiary, if any) industry.

In addition, we use a database that records "dead"
firms, namely business failures or bankruptcies. This
dataset is an exhaustive list of firms that went bankrupt
during the one year starting in October 2006, corresponding
to the snapshot of the network. The data is exhaustive in
the sense that any bankrupted firm with a total amount of
debt exceeding 10 million yen (roughly 70 thousand euros or
100 thousand US dollars) is listed therein. Each record
includes the date of failure, the total amount of debt when
bankrupted, and categorized causes of bankruptcy. The
dataset is of high quality, and its statistical tabulation
is employed by the Statistics Bureau of the government for
official statistics. In the production network, 0.5% to 1%
of nodes exit in a year due to failure (see Section V).
Thus, by combining the two datasets of supplier-customer
links and actual failures, one has an opportunity to do an
empirical study of the dynamics of failure on the production
network. In this respect the present study differs from
previous studies on the Japanese production network,
including [10, 11].
TABLE II: 24 major groups of the manufacturing sector (F).

id   major group                                      #firms (%)
09   foods                                            [illegible]
10   beverages, tobacco and feed                       1.95
11   textile mill products, except apparel/related     3.08
12   apparel and related finished products             5.00
13   lumber and wood products, except furniture        3.57
14   furniture and fixtures                            2.44
15   pulp, paper and paper products                    2.90
16   printing and allied industries                    6.51
17   chemical and allied products                      3.28
18   petroleum and coal products                       0.29
19   plastic products, except otherwise classified     4.98
20   rubber products                                   1.10
21   leather tanning, leather/fur products             0.83
22   ceramic, stone and clay products                  5.09
23   iron and steel                                    1.97
24   non-ferrous metals and products                   1.35
25   fabricated metal products                        12.30
26   general machinery                                13.83
27   electrical machinery, equipment and supplies      5.08
28   information and communication electronics         1.33
29   electronic parts and devices                      2.57
30   transportation equipment                          3.19
31   precision instruments and machinery               2.06
32   miscellaneous manufacturing industries            5.14



III. STRUCTURE OF PRODUCTION 
NETWORK 

A. Global connectivity 

The production network as a directed graph is not
unidirectional from upstream to downstream, but is highly
entangled depending on the products and services that
each firm produces. Let us first examine the global
connectivity using a graph-theoretical method similar to
the one used in the study of the hyperlink structure of the
world-wide web [12]. The following numbers are for the
dataset of September 2006, which contains 1,019,854 firms as
nodes of the network, excluding all firms that went bankrupt
before that month.

From a directed graph, one can obtain an undirected 
graph by simply ignoring the direction of links. A weakly 
connected component of the directed graph refers to a 
connected component in the undirected counterpart. The 
production network has a giant weakly connected com- 
ponent (GWCC) comprised of 99.0% (1,009,597 nodes) 
of the whole. The rest are disconnected components, all 
of which are smaller than 12 in size. 

A strongly connected component (SCC) in a directed
graph is a set of nodes such that for any pair of nodes
u and v in the set there is a directed path from u to
v. There exists a giant SCC having the size of 45.8%
of the GWCC (462,563 nodes). Calling it the GSCC, the
GWCC turns out to be decomposed into mutually disjoint
parts as GWCC = GSCC + IN + OUT + TE, where
IN is the set of non-GSCC nodes from which one can
reach a node (and hence all the nodes) in the GSCC.
Symmetrically, OUT is the set of non-GSCC nodes that one
can reach from any node in the GSCC. TE is the rest
of the GWCC, called tendrils, which consists of nodes
that have no access to the GSCC and are not reachable
from it. Tendrils hang off IN and OUT: they contain
nodes that are reachable from portions of IN, or that
can reach portions of OUT, without passing through
the GSCC. See Figure 6 in [2], noting that the giant
components GIN and GOUT defined there correspond to
GIN = IN + GSCC and GOUT = OUT + GSCC. The IN, OUT and
TE are composed of 18.0% (182,018), 32.1% (324,569) and 4.0%
(40,447) of the nodes in the GWCC, respectively.
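The decomposition GWCC = GSCC + IN + OUT + TE can be illustrated on a toy graph. The following is a minimal sketch, not the procedure actually used for the dataset (the text does not specify one); it assumes that a seed node known to lie in the GSCC is available, and the function names and the toy edges are hypothetical.

```python
from collections import deque


def reachable(adj, sources):
    """Return the set of nodes reachable from `sources` along `adj` (BFS)."""
    seen = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def bow_tie(edges, seed):
    """Split the nodes of a weakly connected digraph into GSCC/IN/OUT/TE.

    Assumption: `seed` is a node known to lie in the giant SCC.
    """
    succ, pred, nodes = {}, {}, set()
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        pred.setdefault(v, []).append(u)
        nodes.update((u, v))
    forward = reachable(succ, [seed])   # seed plus everything downstream of it
    backward = reachable(pred, [seed])  # seed plus everything upstream of it
    gscc = forward & backward
    return {
        "GSCC": gscc,
        "IN": backward - gscc,
        "OUT": forward - gscc,
        "TE": nodes - forward - backward,
    }


# Hypothetical toy network: a -> (b <-> c) -> d, plus a tendril a -> t.
toy_edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("a", "t")]
parts = bow_tie(toy_edges, seed="b")
```

Here node t is reachable from IN but cannot reach the GSCC, so it falls into TE, matching the description of tendrils hanging off IN.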

By comparing the abundance of industrial divisions
in each of these giant components, we observe that in
the IN portion the numbers of firms in the sectors
of real estate (L), forestry (B), and information and
communications (H) are larger when compared with the
corresponding sectors in the GSCC. In the OUT portion,
medical, health care and welfare (N), food establishments
(M), and education (O) are more abundant. This fact is
reasonable, because these industries are located either
upstream or downstream. Nevertheless, all industries
are basically embedded in the GSCC with entanglement.
We shall study community structure in Section IV.

The diameter of a graph is the maximum length for all 
ordered pairs (u,v) of the shortest path from u to v. The 
average distance is the average length for all those pairs 
(u,v). We found that the average distance is 4.59 while 
the diameter is 22. 
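As an illustration of these two definitions, the sketch below computes the average distance and the diameter of a small directed graph by breadth-first search from every node. It is an assumption of this sketch that only ordered pairs (u,v) connected by a directed path are counted; the function name and the toy graph are hypothetical, and for a network with a million nodes one would need a more economical (for example, sampled) computation.

```python
from collections import deque


def distance_stats(edges):
    """Average shortest-path length and diameter over ordered pairs (u, v).

    Assumption: only pairs for which a directed path from u to v exists
    are counted.
    """
    succ, nodes = {}, set()
    for u, v in edges:
        succ.setdefault(u, []).append(v)
        nodes.update((u, v))
    total, pairs, diameter = 0, 0, 0
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in succ.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for target, d in dist.items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return total / pairs, diameter


# Hypothetical toy graph: a directed cycle of four nodes.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
average_distance, diameter = distance_stats(cycle)
```

On the directed four-cycle every node sees the others at distances 1, 2 and 3, so the average distance is 2 and the diameter is 3.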



B. Degree distribution 

In the rest of this paper, we focus on the GWCC, ignoring
small disconnected components. Denoting the numbers of
nodes and links by N and M respectively, we have for the
GWCC

N = 1,009,597 ,   (1)
M = 4,041,442 .   (2)

A firm has suppliers and customers, whose numbers are its
in-degree and out-degree, respectively, according to our
definition of link direction. We show that both have a
long-tailed distribution. Denoting the degree distribution
by $P(k)$, the cumulative distribution is written as
$P_>(k) = \sum_{k'=k}^{\infty} P(k')$. We plot the cumulative
distributions for in- and out-degrees in Fig. 1.

For both the in-degree and the out-degree of a firm, the
distribution has a heavy tail that can be characterized by a
power law $P_>(k) \propto k^{-\mu}$. We estimated the
exponent $\mu$ by maximum likelihood (MLE), i.e. Hill's
estimate [13], in a tail region $k > k_*$. In Fig. 1, the
estimates are shown for $k_* = 40$, namely
$\mu = 1.35 \pm 0.02$ for in-degree and $\mu = 1.26 \pm 0.02$
for out-degree, by solid lines. Here the errors correspond
to $1.96\sigma$ (95% confidence level), where $\sigma$ is the
estimated standard error.
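A minimal sketch of the Hill estimate may help to fix ideas. It assumes the usual continuous approximation, in which the exponent of the cumulative distribution is estimated as n divided by the sum of log(k/k_*) over the n tail observations, with standard error equal to the estimate divided by the square root of n. The function name and the synthetic sample are hypothetical, and the boundary convention (whether observations equal to k_* are included) is an assumption.

```python
import math


def hill_estimate(values, k_star):
    """Hill (maximum-likelihood) estimate of mu in P_>(k) ~ k^(-mu).

    Assumption: continuous approximation; observations with k >= k_star
    form the tail. Returns (mu, 1.96 * standard error, tail size).
    """
    tail = [k for k in values if k >= k_star]
    n = len(tail)
    log_sum = sum(math.log(k / k_star) for k in tail)
    mu = n / log_sum
    sigma = mu / math.sqrt(n)  # asymptotic standard error of the estimate
    return mu, 1.96 * sigma, n


# Synthetic, deterministic Pareto sample (not the paper's data): evenly spaced
# quantiles of a distribution with P_>(k) = (k / k_star)^(-mu_true).
n = 10_000
mu_true, k_star = 1.35, 40.0
sample = [k_star * (1 - (i + 0.5) / n) ** (-1 / mu_true) for i in range(n)]
mu_hat, err, n_tail = hill_estimate(sample, k_star)
```

For degree data, which are integers, the continuous approximation is reasonable only for a sufficiently large threshold, which is one reason to restrict the fit to the tail.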




FIG. 1: (a) Cumulative distribution for in-degree (number of
suppliers). A power-law distribution $P_>(k) \propto k^{-\mu}$ is fitted by
MLE with $\mu = 1.35 \pm 0.02$ (solid line). (b) Same for out-degree
(number of customers). The line is for $\mu = 1.26 \pm 0.02$.

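The first two moments of in/out-degree are

$\langle k_{\rm in} \rangle = \langle k_{\rm out} \rangle = 4.003$ ,   (3)

$\langle k_{\rm in}^2 \rangle = 1.041 \times 10^3$ , $\langle k_{\rm out}^2 \rangle = 1.036 \times 10^3$ .   (4)

For the undirected graph, we have

$\langle k \rangle = 2M/N = 8.006$ ,   (5)

$\langle k^2 \rangle = 3.070 \times 10^3$ .   (6)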

Firms with largest in-degrees belong to the sectors of 
manufacturing and construction among others, including 
heavy industry, electrical machinery (e.g. Hitachi, Mit- 
subishi, Panasonic, Toshiba), automobiles (Toyota, Nis- 
san, Honda), metal production, and so on. Large con- 
struction companies are also included. Firms with the 
largest out-degrees are worldwide traders, distributors of 
construction-related materials, metals, petroleum, me- 
chanical and electrical instruments, and general whole- 
sale companies, as well as the manufacturing firms men- 
tioned for in-degrees. 

C. Correlation to firm-size 

The number of suppliers/customers of a firm obviously
depends on firm-size, an important attribute. A large
firm likely has numerous suppliers from which it buys
various intermediate goods; similarly, it has many customers
to which it sells its products, which is why it is large.
Firm-size can be measured in different ways, basically by
stock variables (total assets, number of employees, etc.)
or flow variables (sales, profits, etc.).

The firm-size, however measured, obeys a power law,
which is well known as Zipf's law. For the nodes in the
network, we examined financial data (availability exceeds
70%, presumably missing only extremely small firms).
The cumulative distribution for the sales of those nodes
(0.73 million) is shown in Fig. 2 (a). Zipf's law,
$P_>(x) \propto x^{-\alpha}$, is evident for sales $x$. The
exponent is close to unity, $\alpha = 0.96 \pm 0.02$,
estimated by MLE for $x > 10^4$ million yen.




FIG. 2: (a) Cumulative distribution for firm-size measured
by sales. A power-law distribution $P_>(x) \propto x^{-\alpha}$ is fitted by
MLE with $\alpha = 0.96 \pm 0.02$ (solid line). (b) Scatter-plot for
degree (total, in + out) and sales.

Fig. 2 (b) depicts a scatter-plot for total degree and
sales. The correlation between a firm's degree and size
is positive. The statistical significance can be quantified
by non-parametric statistics, such as Kendall's rank
correlation $\tau$ (see [14]). For the data, $\tau = 0.391$
(p-value $< 10^{-7}$), which shows a significant positive
correlation between the degree and firm-size. We used
different quantities for firm-size, such as profits and the
number of employees, and obtained very similar results. In
addition, when considering either the in- or the out-degree,
we can observe that each has a positive correlation with
firm-size.
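For readers unfamiliar with Kendall's rank correlation, the following sketch computes it for a handful of (degree, sales) pairs. It implements the simplest variant (concordant minus discordant pairs, divided by the number of pairs) with a quadratic-time loop; which variant and which treatment of ties were used for the actual data is not stated in the text, and the numbers below are made up for illustration.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau without tie correction (tau-a), O(n^2) pair counting."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both quantities order the pair the same way
        elif sign < 0:
            discordant += 1   # the two orderings disagree
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical data: total degree and sales (arbitrary units) of five firms.
degree = [1, 2, 3, 4, 5]
sales = [10, 30, 20, 50, 40]
tau = kendall_tau(degree, sales)
```

Of the ten pairs, eight are concordant and two are discordant, giving a value of 0.6. Because the statistic depends only on ranks, it is insensitive to the heavy tails of both degree and sales.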



D. Transitivity 

Unlike in many social networks, the supplier of a firm's
supplier is not likely to be the firm's supplier as well,
and similarly for customers, because such a process of
production is redundant in most cases. Transitivity
measures how many triangles are present in the network
(see the review [3]). Here we regard the network as
an undirected graph.

The global clustering coefficient is defined by
$C_g = (3 \times \text{number of triangles}) / (\text{number of connected triples})$,
where a connected triple means a pair of nodes that are
connected to another node. $C_g$ is the mean probability
that two firms that have a common supplier/customer are
also suppliers/customers of each other. The undirected
graph of our dataset yields

$C_g = 1.87 \times 10^{-3} = 0.187\%$ .   (7)
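The definition can be checked on a toy undirected graph. The sketch below counts, for every node, the pairs of its neighbors (connected triples centered at that node) and how many of those pairs are themselves linked; each triangle is then counted three times, once per corner, which supplies the factor 3 in the definition. The function name and the toy graph are hypothetical.

```python
from itertools import combinations


def global_clustering(edges):
    """Global clustering coefficient: 3 * triangles / connected triples."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    triples = closed = 0
    for neighbors in adj.values():
        for a, b in combinations(neighbors, 2):
            triples += 1        # a - center - b is a connected triple
            if b in adj[a]:
                closed += 1     # the triple is closed into a triangle
    return closed / triples if triples else 0.0


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
c_g = global_clustering(toy)
```

The toy graph has one triangle and five connected triples, so the coefficient is 3/5. Enumerating neighbor pairs is quadratic in the degree of each node, so hubs with thousands of links dominate the cost on a real network.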



To compare this value with that for a class of random
graphs having the same degree sequence but randomly
rewired links, we use the expected value of the global
clustering coefficient given by [15],

$C'_g = \frac{\langle k \rangle}{N} \left[ \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle^2} \right]^2$ .   (8)

Putting the values (5), (6) and (1) into (8), we have
$C'_g = 1.81 \times 10^{-2}$. The observed value (7) is,
therefore, merely 10% of (8), and shows weaker transitivity
than what is expected by chance. This is reasonable because
triangular relations are suppressed when firms select their
suppliers and customers.
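The arithmetic behind this comparison can be reproduced in a few lines. The sketch assumes that the expected clustering coefficient of [15] has the form (<k>/N) [(<k^2> - <k>)/<k>^2]^2, which is the form that reproduces the quoted value from the moments (5), (6) and the size (1); the variable names are ours.

```python
# Values quoted in the text for the undirected GWCC.
N = 1_009_597          # number of nodes, Eq. (1)
k_mean = 8.006         # <k>, Eq. (5)
k2_mean = 3.070e3      # <k^2>, Eq. (6)
observed = 1.87e-3     # observed global clustering coefficient, Eq. (7)

# Assumed form of the expected value for random graphs with the same
# degree sequence: (<k>/N) * [(<k^2> - <k>) / <k>^2]^2.
expected = (k_mean / N) * ((k2_mean - k_mean) / k_mean**2) ** 2

ratio = observed / expected
print(f"expected = {expected:.3e}, observed/expected = {ratio:.2f}")
```

The expected value comes out close to 1.81 x 10^-2, and the observed coefficient is roughly one tenth of it.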

The average of local clustering coefficient is, on the 
other hand, equal to 4.58% for the same dataset. 



E. Degree correlation 

For each node, the in-degree and out-degree are highly 
correlated. This is consistent with what we saw in Sec- 
tion III C that each quantity has positive correlation with 
firm-size. 

For each link, to see the assortative mixing with respect
to the degrees $(k_1, k_2)$ at both ends of the link [16],
or degree correlation, let us examine the joint distribution
for $(k_1, k_2)$. Here we ignore the direction of links, but
even when taking the four possible combinations of in/out at
a directed link, we obtain similar results. To test for
assortativity, we calculate the frequency $F(k_1, k_2)$ with
which the pair $k_1$ and $k_2$ appears at either end of a
link in the network. We then compare it with the same
quantity $F_r(k_1, k_2)$ obtained in a randomized network
with the same degree sequence. We generated 1,000 randomized
networks and quantify the comparison as the ratio $F/F_r$,
where $F_r$ is the average over the randomizations.
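One common way to generate randomized networks with the same degree sequence is repeated double-edge swapping; the text does not say which procedure was used, so the sketch below is only an illustration under that assumption. It computes the joint frequency of end-point degrees on an undirected toy graph and its ratio to the average over rewired copies. All names, the toy edges, and the numbers of swaps and randomizations are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def joint_frequency(edges):
    """Count how often each unordered degree pair appears at the ends of a link."""
    deg = degrees(edges)
    return Counter(tuple(sorted((deg[u], deg[v]))) for u, v in edges)


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization by double-edge swaps (simple graph kept)."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c                      # pick one of the two pairings
        if a == d or c == b:
            continue                         # would create a self-loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in present or new2 in present:
            continue                         # would create a multi-edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {new1, new2}
        edges[i], edges[j] = (a, d), (c, b)
    return edges


def frequency_ratio(edges, n_random=100, swaps_per_edge=10, seed=0):
    """Observed joint frequency divided by its average over rewired networks."""
    rng = random.Random(seed)
    observed = joint_frequency(edges)
    expected = Counter()
    for _ in range(n_random):
        shuffled = rewire(edges, swaps_per_edge * len(edges), rng)
        for pair, count in joint_frequency(shuffled).items():
            expected[pair] += count / n_random
    return {p: observed[p] / expected[p] for p in observed if expected[p] > 0}


# Hypothetical toy graph: a hub h with four neighbors plus a short chain.
toy = [("h", "a"), ("h", "b"), ("h", "c"), ("h", "d"),
       ("a", "b"), ("c", "d"), ("d", "e"), ("e", "f")]
shuffled = rewire(toy, 200, random.Random(1))
ratios = frequency_ratio(toy)
```

A ratio above one for a degree pair means that links between nodes of those degrees are more frequent than expected by chance; below one, less frequent.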
FIG. 3: Joint distribution for degrees (total) at the ends of each
link. The value is the ratio of the actual frequency divided
by what is expected by chance in random networks with the
same degree sequence as the actual one.



The result is shown in Fig. 3. One can observe that
large-degree nodes (large firms) tend to be connected with
small-degree nodes (small firms). This holds for the hubs
referred to at the end of Section III B, which have a large
number of suppliers and customers, and similarly for firms
of intermediate size, displaying disassortativity [16]. This
can be quantified by the Pearson correlation coefficient $r$
for $(k_1, k_2)$. For the data, we have

$r = -0.0747 \pm 0.0002$ ,   (9)

where the error is calculated by the method given in [16].
This shows that $r$ is negative with statistical
significance.
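The coefficient r can be illustrated on small graphs. The sketch below computes the Pearson correlation of the degrees found at the two ends of each link, counting every undirected link in both orientations so that the measure is symmetric. It uses total degrees, as in the text; the function name and the toy graphs are hypothetical, and the error estimate of [16] is not reproduced.

```python
import math
from collections import Counter


def degree_correlation(edges):
    """Pearson correlation of end-point degrees over undirected links."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    xs, ys = [], []
    for u, v in edges:          # each link contributes both orientations
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n          # xs and ys have the same mean by symmetry
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean * mean
    var = sum(x * x for x in xs) / n - mean * mean
    return cov / var


# Hypothetical toy graphs.
star = [("hub", "a"), ("hub", "b"), ("hub", "c")]   # hub linked only to leaves
path = [("a", "b"), ("b", "c"), ("c", "d")]          # simple chain
r_star = degree_correlation(star)
r_path = degree_correlation(path)
```

A star, in which a hub is linked only to degree-one nodes, is maximally disassortative (r = -1), and the four-node chain gives r = -0.5. The value in Eq. (9) is much closer to zero but, given its small error, still clearly negative.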



IV. COMMUNITY STRUCTURE 

The global connectivity examined in Section III A
shows that basically all industries are highly entangled
with each other within the weakly or strongly connected
component. Yet the connectivity alone does not tell how
densely or sparsely the stream of production is distributed
depending on industrial or geographical groups. Detection
of community structure aims to find how nodes cluster
into tightly-knit groups with a high density of links
within groups and a lower density of links between groups.

We focus, in this section, on the manufacturing sec- 
tor with 0.12 million firms, in order to understand the 
sector's modular structure by excluding other dominant 
sectors including wholesale and retail trade, which obvi- 
ously have a different role in the stream of production 
from the core of manufacturing sector. 

We use the method of maximizing modularity, intro- 
duced by [17] and implemented for large-scale graphs in 
[18] as a greedy optimization. While considerable stud- 
ies have been conducted to develop various methods for 
community extraction, we use the modularity optimiza- 
tion for its clear interpretation in terms of statistical 
hypothesis (also see [19] for a comparative study). Let 
TABLE III: Communities extracted for the subgraph composed of manufacturing firms as nodes (about 0.12 million).
Modularity optimization was applied recursively to the largest communities to obtain the sub-communities, ten of which
are shown here. For each of them, the ten firms with the largest degrees are listed with names and major groups
(primary/secondary/tertiary, if any, in this order) of industrial sectors (see Table II), followed by the
sub-community size.

no. annotation: firms (major groups; primary/secondary/tertiary), ... [community-size]

01 heavy industry: Mitsubishi Heavy Industries (30/26), Kawasaki Heavy Industries (26/30), Kobe Steel (23/25),
   Ishikawajima-harima Heavy Industries (30/26), Sumitomo Heavy Industries (26), Nippon Steel (23),
   Kubota Industries (30/27/23), Mitsui Engineering and Shipbuilding (30), Hitachi Zosen Shipbuilding (26),
   Sumitomo Metal Industries (23), ... [7,447]

02 foods: Itoham Foods (09), Prima Meat Packers (09), Yamazaki Baking (09), Nisshin Seifun Flour (09),
   Maruha Nichiro Foods (09), Nippon Flour Mills (09), Q.P Foods (09), Nihon Shokken Foods (09),
   Toyo Suisan Foods (09), Ichiban-foods (09), ... [7,115]

03 transportation equipment: Honda (30/27), Nissan (30), Toyota Motor (30), Aisin (25/30/27),
   Mitsubishi Motors (30), Denso (30/27), Fuji Heavy Industries (30), Toyota Industries (30/26),
   Suzuki Motor (30), Isuzu Motors (30), ... [5,769]

04 construction material: Sumitomo Osaka Cement (22), Air-Water Industrial Gas (17/18), Kyowa Concrete (22),
   Hokukon Concrete (22), Marukin Steel Materials (23), Mitsubishi Construction Materials (25/22),
   Hinode Steel/Manhole (23/22), Nihon Kogyo Industrial (22/13), Lafarge Aso Cement (22),
   Maeta Concrete (22), ... [2,644]

05 pulp/paper: Oji Paper (15), Rengo Paper (15), Nippon Paper (15), Oji Chiyoda Container (15),
   Tomoku Container (15), Morishigyo Paper (15), Settsu Carton (15), Morishigyo Paper Sales (15),
   Crown Package (15), Yamatoshiki Paper (15/19), ... [3,697]

06 electronics (a): Hitachi (28/29/27), Fujitsu (32/28), NEC (28/29), TDK (27/29), Oki Electric (28/29),
   Hitachi High-Technologies (31/26), Rohm Semi-conductors (29), Murata Electronics (27), IBM Japan (28),
   Japan Radio Communication Equipment (28/27), ... [3,082]

07 electronics (b): Matsushita (Panasonic) (27/31), Sharp (29/27/28), Sanyo (27/25),
   Panasonic Shikoku Electronics (29/27/28), Pioneer (27/28), Matsushita Battery (27), Sanyo Tottori (28),
   Matsushita Refrigeration (27/26), Kenwood (28), CMK Electronic Devices (29), ... [2,921]

08 electronics (c): Canon (28/26/31), Seiko Epson (28/29), Omron (27), Nikon (31/26), Ricoh (26/28),
   Fujinon Optics (31), Hoya Optics (31), Casio (26/31/28), Pentax Optics (31/28),
   Sony EMCS Electronic (27/28), ... [2,692]

09 electronics (d): Toshiba (27/28/29), Stanley Electric (27/26), Toshiba Lighting and Technology (27/26/29),
   Ushio Electric (25/27/26), Hamamatsu Photonics (29/27), Nippon Electric Glass (22), Toshiba Tec (26/27),
   GS Yuasa Industry (27/29), Iwasaki Electric (27), Topcon Electric (31), ... [2,320]

10 apparel: Renoun Apparel (12), Onward Kashiyama Apparel (12), MC Knit Apparel (12), World Apparel (12),
   Sanyo Shokai Apparel (12), Itokin Apparel (12), Fujii Fabrics (11), Sanei-International Apparel (12),
   YKK Fastening and Accessories (32), World Apparel (12), ... [1,567]



$e_{ij}$ be the fraction of edges in the network that connect
nodes in group $i$ to those in group $j$, and let
$a_i = \sum_j e_{ij}$, $b_j = \sum_i e_{ij}$. Then modularity
$Q$ is defined by

$Q = \sum_i \left( e_{ii} - a_i b_i \right)$ ,   (10)

which is the fraction of edges that fall within groups, mi- 
nus the expected value of the fraction under the hypothe- 
sis that edges fall randomly irrespectively of the commu- 
nity structure. The method is formulated as an optimiza- 
tion problem to find a partition of nodes into mutually 
disjoint groups such that the corresponding value of Q is 
maximum. 
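To make the definition concrete, the sketch below evaluates the modularity of a given partition of a small undirected graph. It assumes the form Q = sum over groups of (e_ii - a_i b_i), where e_ij is the fraction of edge ends joining groups i and j and a_i, b_i are its row and column sums; for an undirected graph each edge is split evenly between e_ij and e_ji. It evaluates Q only and does not perform the greedy optimization of [18]; the function name and the toy graph are hypothetical.

```python
from collections import defaultdict


def modularity(edges, group):
    """Modularity Q of the partition `group` (node -> group label)."""
    m = len(edges)
    e = defaultdict(float)
    for u, v in edges:
        gu, gv = group[u], group[v]
        e[(gu, gv)] += 0.5 / m   # split each undirected edge symmetrically
        e[(gv, gu)] += 0.5 / m
    a = defaultdict(float)       # row sums a_i
    b = defaultdict(float)       # column sums b_j
    for (i, j), fraction in list(e.items()):
        a[i] += fraction
        b[j] += fraction
    return sum(e[(g, g)] - a[g] * b[g] for g in set(group.values()))


# Hypothetical toy graph: two triangles joined by a single bridge 2-3.
toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "left", 1: "left", 2: "left", 3: "right", 4: "right", 5: "right"}
one_group = {node: "all" for node in range(6)}
q_split = modularity(toy, two_groups)
q_single = modularity(toy, one_group)
```

Splitting the toy graph into its two triangles gives Q = 5/14, whereas putting all nodes in a single group gives Q = 0, since then the fraction of within-group edges equals its expected value.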

As shown in [20, 21], however, the method can give
undesired groupings, depending on the density of
connections and the network size. In particular, large
communities can potentially contain sub-communities. Since
there is currently no established method to avoid this
problem of resolution limit (see [22], for example, and also
[23]), we shall check the structure of the detected
communities by restricting modularity optimization to each
single community, especially those of relatively large size.



We apply the method of community extraction to the
undirected subgraph whose nodes consist only of firms
in the manufacturing sector (division F in Table I). The
resulting modularity (10) is $Q = 0.566 \pm 0.001$, which
indicates strong community structure (the error is
calculated by the method given in [16]). The number of
extracted communities exceeds a thousand, and their sizes
range from a few to more than 10,000. From the database of
information on the firms, we found that many of the small
communities are each located in a single geographical area,
forming specialized production flows. An example is a small
group of a flour-maker, noodle-food producers, bakeries, and
packing/labeling companies in a rural area.

On the other hand, five large communities exceed
10,000 each in size, and are possibly subject to the
above problem of resolution. After checking the
sub-communities in the above-mentioned fashion, we obtained
the communities tabulated in Table III. The necessity
of this procedure can be clearly seen for the communities
annotated as "electronics" (a)-(d), which constitute a
single community in the first stage of optimization. Each
firm is classified into one or more industrial sectors;
Table III lists the major-group classifications (2 digits;
see Table II).



FIG. 4: (Color online) Layout of nodes for firms in the manufacturing sector (F) by a force-directed method. The links are 
omitted, and different colors are put on the nodes belonging to four largest communities. The community of color red (middle 
bottom and encircled by a dotted line) is divided into three sub-communities of electronics (a), (b) and (c) given in Table III 
(enlarged in the right column). 



Obviously a community contains those firms in closely
related industrial sectors. The annotations (heavy
industries, foods, transportation equipment, etc.) are
made on the basis of such observations.

Let us closely examine the modular structure of those
large communities. Note that the ten firms with the largest
degrees (typically the largest firm-sizes) are listed in
each community. These large firms in the same community do
not form a set of nodes that are mutually linked in nearly
all possible ways, that is, a quasi-clique. Rather, with
their suppliers and customers, they form a quasi-clique in a
corresponding bipartite graph as follows. A
supplier-customer link $u \to v$ for a set of nodes $V$
($u, v \in V$) can be considered as an edge in a bipartite
graph that has exactly two copies of $V$ as $V_1$ and $V_2$
($u \in V_1$ and $v \in V_2$). Those large and competing
firms quite often share a set of suppliers to some extent,
depending on the industrial sectors, geographical locations
and so on.

For example, Honda ($v_1$), Nissan ($v_2$) and Toyota ($v_3$) possibly have a number of
suppliers $u_i$ of mechanical parts, electronic devices, chassis, assembling machines,
etc., in common. Then the links form a clique or a quasi-clique in the bipartite graph,
in which most possible links from $u_i$ to $v_1, v_2, v_3, \ldots$ are present. This
forms a portion of the original graph with a higher density than other portions. By
enumerating cliques in the bipartite graph and examining them, we found that this
is actually the case for the community (03) in Table III,
and similarly for all the other communities therein.
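
The shared-supplier picture can be illustrated with a small density check. This is a minimal sketch, not the clique enumeration used in the analysis: the function name `biclique_density`, the threshold `min_shared`, and the toy firm labels are all hypothetical. It measures how complete the bipartite block is between a chosen set of customers and the suppliers that serve several of them.

```python
def biclique_density(links, customers, min_shared=2):
    """Density of the bipartite block between `customers` (copy V_2) and the
    suppliers (copy V_1) that serve at least `min_shared` of them.

    `links` is an iterable of (supplier, customer) pairs, i.e. u -> v.
    Returns (density, sorted list of shared suppliers).
    """
    customers = set(customers)
    served = {}
    for u, v in links:
        if v in customers:
            served.setdefault(u, set()).add(v)
    # Keep only suppliers shared by several of the chosen customers.
    shared = {u: vs for u, vs in served.items() if len(vs) >= min_shared}
    if not shared:
        return 0.0, []
    present = sum(len(vs) for vs in shared.values())
    possible = len(shared) * len(customers)
    return present / possible, sorted(shared)


# Toy data (hypothetical): s1 supplies all three makers, s2 supplies two.
links = [("s1", "A"), ("s1", "B"), ("s1", "C"),
         ("s2", "A"), ("s2", "B"), ("s3", "C")]
density, shared = biclique_density(links, ["A", "B", "C"])
```

A density close to 1 corresponds to a quasi-clique in the bipartite graph; a density of exactly 1 is a clique.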

For the case of electronics (a)-(d), those quasi-cliques are further separated into
groups. Namely, the suppliers belong to different groups of industrial organization for
historical reasons and the so-called keiretsu, and/or are located in separate
geographical areas. The sub-communities (a)-(d) can be considered as such separate
groups with mutually sparse links. The electronics group (b), for instance, originated
and developed in an urban area in western Japan, not in the eastern urban area around
Tokyo, which distinguishes it from group (a).

These interpretations of modular structure should be strengthened by more detailed
analysis, especially with a new technique for extracting communities that are present
at multiple scales in the hierarchical organization of the production network (see
[24, 25] for example); this is to be published elsewhere.

Here, to check the intra-group and inter-group connectivities, we resort to
visualization of the entire manufacturing sector by a graph layout based on a physical
simulation. The system in the simulation consists of point particles for nodes and
springs for links. The springs obey Hooke's law with a spring constant, and the
particles carry Coulomb charges of the same sign, exerting repulsive forces inversely
proportional to the square of their mutual distances, so that the nodes spread well over
the layout. A resistance force proportional to velocity also acts on each particle, in
order to relax the system into a final layout. The Barnes-Hut tree algorithm [26] is
employed for fast computation, and the Coulomb interaction was calculated on a
special-purpose device (GRAPE; gravity pipeline) invented for astrophysical $N$-body
simulation [27]. The result is depicted in Fig. 4. Details of the layout method are
given in [28].

One can observe that nodes within a tightly-knit group cluster at mutually near
positions in the layout, while different communities are separated from each other,
with some overlapping portions. The sub-communities stated above appear as clusters
nested in the community of electronics. An even closer look at an enlargement (not
shown in the figure) reveals blobs corresponding to hubs, or large firms, together
with their suppliers and customers.
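
The spring-and-charge relaxation behind such a layout can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a direct O(N^2) pairwise sum rather than the Barnes-Hut tree or GRAPE hardware, unit masses, zero natural spring length, a small softening term, and hypothetical parameter values (`k_spring`, `charge`, `damping`, `dt`), none of which are taken from the paper.

```python
import math
import random


def force_layout(n_nodes, edges, steps=2000, dt=0.01, k_spring=1.0,
                 charge=1.0, damping=2.0, seed=0):
    """Relax springs (links) plus same-sign charges (nodes) with friction."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
           for _ in range(n_nodes)]
    vel = [[0.0, 0.0] for _ in range(n_nodes)]
    for _ in range(steps):
        force = [[0.0, 0.0] for _ in range(n_nodes)]
        # Coulomb repulsion ~ 1/r^2 between every pair (direct sum).
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                dx = pos[i][0] - pos[j][0]
                dy = pos[i][1] - pos[j][1]
                r2 = dx * dx + dy * dy + 1e-2  # softening (an assumption)
                coef = charge / (r2 * math.sqrt(r2))
                force[i][0] += coef * dx
                force[i][1] += coef * dy
                force[j][0] -= coef * dx
                force[j][1] -= coef * dy
        # Hooke springs of zero natural length along each link.
        for u, v in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            force[u][0] += k_spring * dx
            force[u][1] += k_spring * dy
            force[v][0] -= k_spring * dx
            force[v][1] -= k_spring * dy
        # Resistance proportional to velocity, then one Euler step.
        for i in range(n_nodes):
            for d in (0, 1):
                vel[i][d] += dt * (force[i][d] - damping * vel[i][d])
                pos[i][d] += dt * vel[i][d]
    return [tuple(p) for p in pos]


# Two linked pairs with no links between them.
layout = force_layout(4, [(0, 1), (2, 3)])
```

In this toy case linked nodes settle near the distance where spring and repulsion balance, while the two unlinked pairs drift apart, which is the mechanism that separates sparsely connected groups in the layout.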



V. CHAIN OF BANKRUPTCY

Let us now turn our attention to dynamics on the production network. Firms put value
added on intermediate goods in anticipation of gaining profits. It is only an
anticipation, because no firm knows how much its products will actually be demanded by
other firms and consumers. In addition, firms face uncertainty in the costs of the
intermediate goods they purchase as inputs, as well as fluctuations in labor and
financial costs. Only a posteriori, therefore, is a firm's profit, equal to sales minus
costs, determined through the interaction with others in the network.

A supplier-customer link is a credit relation [29]. Whenever one delivers goods to
others without an immediate exchange of money or goods of full value, credit is
extended. Frequently, suppliers provide credit to their customers, who supply credit to
their customers, and so forth. Customers can also provide credit to their suppliers so
as to have them produce an abundance of intermediate goods beforehand. In either case,
once a firm becomes financially insolvent, its creditors may lose the scheduled payment,
or the goods to be delivered that are necessary for production. The influence propagates
from the bankrupted customer to its upstream firms in the former case, and from the
bankrupted supplier to its downstream firms in the latter case. Thus a creditor's
balance sheet deteriorates cumulatively, and the creditor may eventually go into
bankruptcy. This is an example of a chain of bankruptcy.

A bankruptcy chain does not occur only along supplier-customer links. Ownership
relations among firms are another typical possibility for such a creditor-debtor
relationship. It is, however, frequently observed in our dataset that supplier-customer
links are also present between holding and held companies, and between sibling and
related firms. We assume that the most relevant paths along which the chain of
bankruptcy occurs are the creditor-debtor links of the production network.

As explained in Section II, we have an exhaustive list of bankruptcies. Corresponding
to the snapshot of the network taken in September 2006, we employ all the bankruptcies
for exactly one year from October. The number of bankruptcies amounts to roughly 0.013
million, with a daily mean of about 30, and includes a few bankruptcies of listed firms.
Nearly half of the bankrupted firms, precisely $N_b = 6264$, were present on the network
at the beginning and went into bankruptcy during the period. The rest were extremely
small, typically with one employee, and were not included as nodes; we assume that they,
as well as firms newly entering during the same period, are irrelevant to our purpose.

Let us define the probability of bankruptcy by

$$ p = N_b / N \approx 0.620\% . \tag{11} $$

Note that the probability has the physical dimension of inverse time. A year was chosen
as the time-scale so that it is longer than the time-scale of firms' financial
activities, typically weeks and months, and shorter than that of changes in the network
itself.



A. Avalanche-size distribution 

Let us first take a look at how a certain size of chain 
of bankruptcies actually takes place. Here a chain is de- 
fined as a set of bankrupted nodes that are connected by 
links that are present in the initial network. If nodes are 
white and black according to survival and bankruptcy 
during the period, a chain means connected black nodes 
surrounded by white nodes, and its size refers to the num- 
ber of black nodes in the chain. 
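
The definition of a chain translates directly into a connected-component count restricted to the black nodes. The following is a minimal sketch; the function name `avalanche_sizes` and the toy data are hypothetical, and links are treated as undirected, as in the definition above.

```python
from collections import Counter, deque


def avalanche_sizes(edges, bankrupt):
    """Histogram {size: count} of connected clusters of bankrupted nodes."""
    bankrupt = set(bankrupt)
    adj = {b: [] for b in bankrupt}
    for u, v in edges:
        # Only links joining two black nodes can extend a chain.
        if u in bankrupt and v in bankrupt and u != v:
            adj[u].append(v)
            adj[v].append(u)
    seen, sizes = set(), []
    for start in bankrupt:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over black nodes only
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        sizes.append(size)
    return Counter(sizes)


# Toy network: node 3 survives, so it separates {1, 2} from {4}.
toy_edges = [(1, 2), (2, 3), (3, 4), (5, 6), (7, 8)]
histogram = avalanche_sizes(toy_edges, {1, 2, 4, 5, 6, 9})
```

Here `histogram[m]` plays the role of the observed number of avalanches of size $m$.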

Fig. 5 shows the size distribution of such avalanches by filled squares, which represent
the frequency of avalanches of each size. The observed values are tabulated in Table IV.

FIG. 5: Frequency (vertical, log scale) of avalanches with a specific size (horizontal,
linear scale). Filled squares are the observed frequencies. Open circles show a
theoretical calculation for randomized networks with anonymous nodes (see (a) in the
text), and the line with error bars represents a Monte Carlo calculation for randomized
networks with the same bankrupted nodes as the observed ones (b).

TABLE IV: Comparison between the actual values of the chain-bankruptcy and the values
expected from coincidence. "Obs." denotes the observed values, "Theor." the theoretical
values of coincidence, "O/T" the ratios between them, and "RNW" the values obtained by
simulation for the randomized network.

         Obs.    Theor.            O/T    RNW (a)
$B_1$    5507    6013              0.9    5985(21)
$B_2$    226     76.9 (82.3 (b))   2.9    118.5(10.0)
$B_3$    52      8.6 (b)           6.0    10.5(3.2)
$B_4$    17      2.2 (b)           7.7    1.7(1.1)

(a) Standard deviations in parentheses (each 1,000 randomizations).
(b) Mean-field approximations.



B. Evaluation of accidental chain 

Let us then evaluate whether a chain of actual bankruptcies of a certain size occurs
more or less frequently than expected simply by chance. Suppose that, in a random
network with a specified degree sequence, one selects $Np$ nodes for failure, where $p$
is the probability of failure per node. One can then calculate the frequency of a
certain number of failed nodes being connected by links, and compare the frequency of
accidental chains of failures with that of actual chains of bankruptcies. We shall use
the terms failures and failed nodes to distinguish the random selection of black nodes
in a random network from the actuality, bankruptcies and bankrupted nodes.

The selection of failed nodes can be done in two ways: (a) by choosing nodes uniformly
at random, so that they are anonymous and unrelated to the actual nodes, or
alternatively (b) by specifying exactly the same bankrupted nodes as in the actual data,
but in an otherwise randomized network with the same degree sequence as the real one.
These two ways may yield different results for our purpose, so let us perform the
evaluation in both ways. The evaluation (a) allows us to understand how the accidental
chain is related to the network properties, especially the degree distribution and
correlation, the results for which were given in Section III. The evaluation (b), on the
other hand, lets us take into account the difference between failures and bankruptcies
in our terminology here.

We elaborate the calculation of (a) in Appendix A. It should be noted that the
calculation of the accidental chain in Appendix A takes into account the long-tailed
degree distribution and the degree correlation in otherwise random networks. For (b),
we calculated the estimates by a Monte Carlo simulation that generates random networks
with failures assigned to the actually bankrupted nodes.
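
Evaluation (b) can be sketched as follows. This is an illustrative stand-in, not the simulation code used for Table IV: it assumes that the randomization is done by repeated double-edge swaps (a common way to preserve the degree sequence), and the function names and the parameters `n_random` and `swaps_per_edge` are hypothetical.

```python
import random
from collections import Counter


def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization by double-edge swaps (a-b, c-d -> a-d, c-b)."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        if rng.random() < 0.5:
            c, d = d, c
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # would create a multiple link
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


def cluster_counts(edges, failed):
    """Counter {m: number of clusters of m connected failed nodes} (union-find)."""
    failed = set(failed)
    parent = {v: v for v in failed}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        if u in failed and v in failed:
            parent[find(u)] = find(v)
    return Counter(Counter(find(v) for v in failed).values())


def expected_counts(edges, failed, n_random=100, swaps_per_edge=10, seed=0):
    """Mean cluster counts over randomized networks, keeping `failed` fixed."""
    rng = random.Random(seed)
    totals, current = Counter(), list(edges)
    for _ in range(n_random):
        current = rewire(current, swaps_per_edge * len(current), rng)
        totals.update(cluster_counts(current, failed))
    return {m: c / n_random for m, c in sorted(totals.items())}
```

Called as `expected_counts(edges, bankrupted_nodes)`, this returns estimates corresponding to the "RNW" column; the standard deviations in that column would need the per-sample counts to be kept as well.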

We denote by $B_m$ the number of clusters, each with $m$ failed nodes that are connected
by links, averaged over randomized networks with the same degree distribution $P(k)$ as
the actual production network. Since the clustering coefficient is of the same order of
magnitude as what is expected by chance, we assume that the network is tree-like. We
shall calculate $B_m$ for $m = 1, 2, 3$ and 4 in order.

The results are summarized in Table IV and Fig. 5 for the actually observed values of
$B_m$, along with the values evaluated for the two classes of randomized networks
mentioned above, (a) and (b), in the columns "Theor." and "RNW" respectively. We find
that (a) and (b) give quantitatively similar estimates. By comparing the actually
observed values with the evaluation for random networks with the same degree sequence,
we can conclude that the avalanche-size distribution has a much heavier tail for sizes
larger than 3. Those large avalanches involve regionally and industrially related firms,
as we could confirm from our dataset. Therefore, the vulnerable paths along which a
chain of bankruptcy takes place are present in those modular groups.



VI. DISCUSSION 

At the very macroscopic level, the real economy has an important quantity: the total
sum of value added, namely the GDP of a nation. A production network is a giant arena
where firms, as microscopic agents, compete and are coupled with each other through
fluctuations in financial activities. It is not straightforward to aggregate these
microscopic fluctuations into macroscopic variables; they do not simply sum up, due to
the heterogeneity of the arena at mesoscopic levels. It would be valuable to cast
relevant issues in a macroscopic economy into such a mesoscopic description, and to
understand the origin of fluctuations in terms of the network and models of its
dynamics.

We already mentioned in the introduction the influence of increasing or decreasing
demand on upstream portions of the network. Demand fluctuations do not propagate
randomly or uniformly, but depend on the modular structures of sectors, locations,
ownership, exchange of people and technologies, etc. Another example is the price of
goods and services, which affects the demand for and supply of each firm. If a serious
rise in metal prices occurred, for instance, it could cause a chain of price rises
downstream in production, including raw steel, processed steel manufactures,
ship-building, automobile manufactures, and so on.

The chain of bankruptcy, which we examined in Section V, has a great influence on a
nation-wide economy. In fact, the total amount of debts of bankrupted firms in a year
has typically ranged from 10 to 25 trillion yen over the last 10 years, roughly equal to
more than 100 billion euro. This amounts to 2% or even more of the nominal GDP of
Japan. Of course, not all of the debts are lost, but one should not undervalue the fact
that there are a large number of creditors who have given credit to those bankrupted
firms [30]. The most vulnerable paths would be those for firms that have only a limited
number of customers (out-degrees); losing a single link may deteriorate such a firm's
balance sheet by a jump process. On the other hand, one may think that large firms with
a large number of customers might not be influenced much by a bankruptcy downstream.
However, due to the heavy-tailed distribution of degrees, vulnerable paths that
influence small firms are abundant two or more links away. Therefore, the ripple effect
is more extensive than one would naively estimate. Also, the effect does not take place
homogeneously but along the modular structures of industrial sectors and geographical
regions, which we have examined in this paper. Firms might profit from the adoption of
methods used in network analysis (see [31] for a similar approach in the context of
commercial banks).

Finally, let us briefly refer to the dynamics of the network. The financial activities
of firms typically take place on a time-scale of weeks and months, as coupled
balance-sheet dynamics. However, the coupling is not fixed. Firms acquire suppliers and
customers as alternatives, or to extend their business, and also abandon some of them,
over periods of years and decades. The decisions of firms in forming and discarding
links would be based on an evaluation of the costs and benefits of suppliers and
customers. It is an important issue how these decisions are related to the slowly
varying structural change of the production network, and to what extent the firms'
strategic activities succeed.



VII. SUMMARY

We studied the large-scale structure of a nation-wide production network comprising a
million firms and four million supplier-customer links in Japan. The set of nodes covers
most active firms. Each link was chosen and considered as important, in a systematic
survey of credit information, by at least one of the firms at either end of the link, as
one of its suppliers or customers. We found a scale-free degree distribution,
disassortativity, correlation of degree to firm-size, and small clustering coefficients
compared with randomized networks with the same degree sequence. In the community
analysis, which is based on modularity optimization, we were able to identify
communities in the manufacturing sector, and found that they can be interpreted as
modules depending on industrial sectors and geographical regions. Large communities
contained subgroups that can also be characterized by industrial organization and
development.

In addition, by employing an exhaustive list of bankruptcies that took place on the
production network, we took a close look at the size distribution of chains of
bankruptcies, or avalanche-size distribution. We elaborated a method to evaluate the
frequencies of accidental chains in randomized networks, and found that the actual
avalanche has a heavy-tailed distribution in its size. Combining this with the
large-scale properties and heterogeneity in modular structures, we claim that the effect
on a number of creditors, non-trivially large due to the heavy tail in the degree
distribution, is considerable in the real economy of the nation.

APPENDIX A: EVALUATION FOR ACCIDENTAL CHAIN OF FAILURES

In this appendix, we count the number of accidental chains of failures in a given
tree-like network, where all the failures occur randomly with probability $p$ per node.

1. $B_1$: Isolated failure



Denoting the number of nodes of degree $k$ by $K_1(k)$, it is related to the degree
distribution $P(k)$ by

$$ P(k) = \frac{1}{N} K_1(k) . \tag{A1} $$

Obviously

$$ \sum_{k=1}^{\infty} K_1(k) = N . \tag{A2} $$

Among the $K_1(k)$ nodes of degree $k$, the average number of failed nodes is
$p K_1(k)$. Since the probability that all the nodes connected to a failed node are not
in failure is $(1-p)^k$ (see Fig. 6), the average number of isolated failures is given
by

$$ B_1 = \sum_{k=1}^{\infty} p K_1(k) (1-p)^k = p N R_1 , \tag{A3} $$

$$ R_1 \equiv \langle (1-p)^k \rangle , \tag{A4} $$

where $\langle \cdot \rangle$ means the average over the nodes, i.e.,

$$ \langle f(k) \rangle = \frac{\sum_{k=1}^{\infty} f(k) K_1(k)}{\sum_{k=1}^{\infty} K_1(k)} . \tag{A5} $$

Note that since $Np$ is equal to the actual number of bankruptcies, $R_1$ gives the rate
of isolated failures. From the observed distribution of $K_1(k)$ and
$p \approx 0.006204$, we obtain

$$ R_1 \approx 0.9600 , \tag{A6} $$

$$ B_1 \approx 6013.2 . \tag{A7} $$

The actual number of isolated bankruptcies is 5,507, which is 92% of this estimate.
Following the standard argument of the $1/\sqrt{n}$ estimate of statistical errors, we
see that there are fewer isolated bankruptcies in actuality than expected by chance.

The results (A6) and (A7) apply to any class of networks with the same degree sequence
$K_1(k)$, in particular, irrespective of degree correlation.

FIG. 6: Clusters of $m$ failed nodes that are connected by links, which contribute to
$B_{1,2,3}$. Black nodes are failed ones, and white nodes are non-failed. The numbers
attached to the failed nodes correspond to the subscripts of the degrees, $j$ for $k_j$.

2. $B_2$

We denote by $K_2(k_1,k_2)$ the number of nodes of degree $k_1$ connected with a node
of degree $k_2$. That is, choose a node with degree $k_1$, and count the nodes with
degree $k_2$ connected with the first node. After doing this over all the nodes and
adding up the resulting numbers, one has the quantity $K_2(k_1,k_2)$. Now the number of
double-failure cases, $B_2$, can be expressed by $K_2(k_1,k_2)$ as follows:

$$ B_2 = \frac{1}{2} \sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2)\, p(1-p)^{k_1-1}\, p(1-p)^{k_2-1}
       = \frac{1}{2} \frac{p^2}{(1-p)^2} R_2 \sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2) , \tag{A8} $$

$$ R_2 \equiv \langle (1-p)^{k_1+k_2} \rangle_2 , \tag{A9} $$

where the combinatorial factor 1/2 accounts for over-counting the chain in the reverse
order of $k_1$ and $k_2$, and $\langle \cdot \rangle_2$ denotes the average over the
links, defined by

$$ \langle f(k_1,k_2) \rangle_2 = \frac{\sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2) f(k_1,k_2)}{\sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2)} . \tag{A10} $$

From the definition, $K_2(k_1,k_2)$ satisfies the identities:

$$ K_2(k_1,k_2) = K_2(k_2,k_1) , \tag{A11} $$

$$ \sum_{k_2=1}^{\infty} K_2(k_1,k_2) = k_1 K_1(k_1) . \tag{A12} $$
The two identities lead to the summation formula:

$$ \sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2) = N \langle k \rangle , \tag{A13} $$

which is exactly twice the number of links, as it should be. The following identity is
also useful:

$$ \sum_{k_1,k_2=1}^{\infty} K_2(k_1,k_2)\, k_1^n = N \langle k \rangle \langle k_1^n \rangle_2 = N \langle k^{n+1} \rangle . \tag{A14} $$

Using Eq. (A13), $B_2$ can be put as

$$ B_2 = \frac{1}{2} \frac{p^2}{(1-p)^2} R_2 N \langle k \rangle . \tag{A15} $$

The factor $R_2$ can be calculated directly from the actual values of $K_2(k_1,k_2)$.
The result is

$$ R_2 \approx 0.488 , \tag{A16} $$

which leads to

$$ B_2 \approx 76.9 . \tag{A17} $$

The observed value is about three times this estimate, as shown in Table IV, which
indicates that the double-failure chain is much more abundant than expected by chance.
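
Equations (A3) and (A8) can be evaluated directly from an edge list, since summing over links once is the same as taking half of the sum over $K_2(k_1,k_2)$. The sketch below assumes an undirected, tree-like network without isolated nodes; the function name `chance_b1_b2` and the toy path graph are hypothetical.

```python
from collections import Counter


def chance_b1_b2(edges, p):
    """Expected numbers of isolated failures (B_1) and double failures (B_2)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # (A3): a node fails and none of its k neighbours fails.
    b1 = sum(p * (1 - p) ** k for k in degree.values())
    # (A8): both ends of a link fail; the other k1-1 and k2-1 neighbours do not.
    # Each undirected link is visited once, which absorbs the factor 1/2.
    b2 = sum(p * (1 - p) ** (degree[u] - 1) * p * (1 - p) ** (degree[v] - 1)
             for u, v in edges)
    return b1, b2


# Path 0-1-2 with p = 1/2, small enough to verify by hand.
b1, b2 = chance_b1_b2([(0, 1), (1, 2)], 0.5)
```

As a consistency check on the reported figures, $pN R_1 = 6264 \times 0.9600 \approx 6013$, in agreement with (A7).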

— Random-network approximation — 

In the case of a random network, the estimation reduces to a simple expression. In
fact, first note that $K_2(k_1,k_2)$ can be written in terms of $K_1(k)$ as follows:

$$ K_2^{(\mathrm{ran})}(k_1,k_2) = \frac{1}{N \langle k \rangle}\, k_1 K_1(k_1)\, k_2 K_1(k_2) , \tag{A18} $$

because it is equal to the number of link ends attached to nodes of degree $k_1$,
$k_1 K_1(k_1)$, multiplied by the probability of choosing a node of degree $k_2$,
$k_2 K_1(k_2) / \sum_{k_2=1}^{\infty} k_2 K_1(k_2)$. Note that (A18) satisfies the
identities (A11) and (A12), as is required for the consistency of the calculation. It
follows from (A18) that the value of $R_2$ in this approximation is given by the
following:

$$ R_2^{(\mathrm{ran})} = R_{11}^2 , \tag{A19} $$

where

$$ R_{11} \equiv \frac{\langle k (1-p)^k \rangle}{\langle k \rangle} . \tag{A20} $$

This yields

$$ R_2^{(\mathrm{ran})} \approx 0.523 \tag{A21} $$

and

$$ B_2^{(\mathrm{ran})} = \frac{1}{2} \frac{p^2}{(1-p)^2} R_2^{(\mathrm{ran})} N \langle k \rangle \approx 82.3 , \tag{A22} $$

as tabulated in Table IV.



3. $B_3$

We define $K_3(k_1,k_2,k_3)$ in a way similar to $K_2(k_1,k_2)$: take a node with
degree $k_1$, and continue the counting to nodes with degree $k_2$ and then $k_3$ (see
Fig. 6). $B_3$ is given by

$$ B_3 = \frac{1}{2} \sum_{k_1,k_2,k_3=1}^{\infty} K_3(k_1,k_2,k_3)\, p(1-p)^{k_1-1}\, p(1-p)^{k_2-2}\, p(1-p)^{k_3-1}
       = \frac{1}{2} \frac{p^3}{(1-p)^4} R_3 K_3^{(\mathrm{sum})} , \tag{A23} $$

$$ R_3 \equiv \langle (1-p)^{k_1+k_2+k_3} \rangle_3 , \tag{A24} $$

$$ K_3^{(\mathrm{sum})} \equiv \sum_{k_1,k_2,k_3=1}^{\infty} K_3(k_1,k_2,k_3) , \tag{A25} $$

where the combinatorial factor 1/2 is to cancel the over-counting, and
$\langle \cdot \rangle_3$ refers to the average weighted with $K_3(k_1,k_2,k_3)$ in the
same manner as in Eqs. (A5) and (A10).

By definition, $K_3(k_1,k_2,k_3)$ satisfies the following identities:

$$ K_3(k_1,0,k_3) = 0 , \tag{A26} $$

$$ K_3(k_1,k_2,k_3) = K_3(k_3,k_2,k_1) , \tag{A27} $$

$$ \sum_{k_3=1}^{\infty} K_3(k_1,k_2,k_3) = K_2(k_1,k_2)(k_2-1) . \tag{A28} $$

Using the identities (A28) and (A12) we find that

$$ K_3^{(\mathrm{sum})} = N \left( \langle k^2 \rangle - \langle k \rangle \right) \approx 3.09 \times 10^{9} . \tag{A29} $$

Since exact evaluation of $R_3$ involves the evaluation of $K_3(k_1,k_2,k_3)$, which
requires a huge computational resource, let us evaluate $R_3$ by using a random-network
approximation. By attaching the nodes #2 and #3 successively to the node #1 with equal
probability on each link, as in the case of $K_2^{(\mathrm{ran})}$, we obtain the
following:

$$ K_3^{(\mathrm{ran})}(k_1,k_2,k_3) = \frac{1}{N^2 \langle k \rangle^2}\, K_1(k_1) k_1\, k_2 K_1(k_2) (k_2-1)\, k_3 K_1(k_3) , \tag{A30} $$

which satisfies identities (A26)-(A29), except that $K_2$ is replaced by
$K_2^{(\mathrm{ran})}$ in (A28). By using the above in (A23) we obtain:

$$ R_3^{(\mathrm{ran})} = R_{11}^2\, \frac{R_{12} \langle k^2 \rangle - R_{11} \langle k \rangle}{\langle k^2 \rangle - \langle k \rangle} , \tag{A31} $$

where

$$ R_{12} \equiv \frac{\langle k^2 (1-p)^k \rangle}{\langle k^2 \rangle} \approx 0.0450 . \tag{A32} $$

This leads to

$$ R_3^{(\mathrm{ran})} \approx 0.0226 , \tag{A33} $$

$$ B_3^{(\mathrm{ran})} \approx 8.55 . \tag{A34} $$
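
The random-network estimates can be reproduced from the reported summary quantities alone. The following arithmetic check is a sketch under one assumption that is not stated explicitly in the text: the number of nodes is taken as $N = N_b / p$ from Eq. (11). The moments and ratios are the values quoted in Appendix B and Table V.

```python
p = 0.006204              # probability of failure per node
n_bankrupt = 6264         # N_b
N = n_bankrupt / p        # number of nodes implied by Eq. (11) (assumption)
mean_k = 8.006            # <k>
mean_k2 = 3069.6          # <k^2>
R11 = 0.7230              # <k (1-p)^k> / <k>
R12 = 0.04501             # <k^2 (1-p)^k> / <k^2>

# (A19), (A22): double failure in the random-network approximation
R2_ran = R11 ** 2
B2_ran = 0.5 * p ** 2 / (1 - p) ** 2 * R2_ran * N * mean_k

# (A29), (A31), (A23): triple failure in the random-network approximation
K3_sum = N * (mean_k2 - mean_k)
R3_ran = R11 ** 2 * (R12 * mean_k2 - R11 * mean_k) / (mean_k2 - mean_k)
B3_ran = 0.5 * p ** 3 / (1 - p) ** 4 * R3_ran * K3_sum

print(round(R2_ran, 3), round(B2_ran, 1), round(R3_ran, 4), round(B3_ran, 2))
```

The printed values can be compared with (A21), (A22), (A33) and (A34).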



4. $B_4$

The clusters that contribute to $B_4$ are illustrated in Fig. 7, and are divided into
two types as depicted. One has to keep in mind that larger clusters are rarer events,
so that statistical errors in observation increase drastically. With this in mind, let
us perform the estimation and compare the results with the observed values.




FIG. 7: Two types of clusters that contribute to $B_4$. Non-failed nodes (white nodes
in Fig. 6) are not drawn.



— Random-network approximation — 
Type (a)

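The contribution of the cluster (a) can be written as follows, using the number of
strings $k_1$-$k_4$, $K_4(k_1,k_2,k_3,k_4)$:

$$ B_{4a} = \frac{1}{2} \sum_{k_1,k_2,k_3,k_4=1}^{\infty} K_4(k_1,k_2,k_3,k_4)\, p^4 (1-p)^{k_1+k_2+k_3+k_4-6}
          = \frac{1}{2} \frac{p^4}{(1-p)^6} R_{4a} K_4^{(\mathrm{sum})} , \tag{A35} $$

$$ R_{4a} \equiv \left\langle (1-p)^{\sum_{i=1}^{4} k_i} \right\rangle_4 , \tag{A36} $$

where the definitions of $\langle \cdot \rangle_4$ and $K_4^{(\mathrm{sum})}$ should be
self-evident.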

Let us first calculate $K_4^{(\mathrm{sum})}$. The identities satisfied by
$K_4(k_1,k_2,k_3,k_4)$ are similar to those of $K_3(k_1,k_2,k_3)$ and should now be
obvious. Using them, the summations over $k_1$ and $k_4$ can be carried out as follows:

$$ K_4^{(\mathrm{sum})} = \sum_{k_2,k_3=1}^{\infty} K_2(k_2,k_3)(k_2-1)(k_3-1) . \tag{A37} $$



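In the expansion of each summand, all the terms except the $k_2 k_3$ term allow further
summation by repeatedly using the identities given so far. On the other hand, since the
coefficient of the $k_2 k_3$ term is $K_2(k_2,k_3)$, this term is related to the degree
correlation. As usual, we define the correlation coefficient $r_1$ by

$$ r_1 \equiv \frac{\langle k_1 k_2 \rangle_2 - \langle k_1 \rangle_2^2}{\langle k_1^2 \rangle_2 - \langle k_1 \rangle_2^2} . \tag{A38} $$

Using this definition, we can write

$$ \langle k_1 k_2 \rangle_2 = r_1 \langle k_1^2 \rangle_2 + (1-r_1) \langle k_1 \rangle_2^2
   = r_1 \frac{\langle k^3 \rangle}{\langle k \rangle} + (1-r_1) \frac{\langle k^2 \rangle^2}{\langle k \rangle^2} , \tag{A39} $$

where we used Eq. (A14) and re-labeled the subscripts of the degrees. Thus the
$k_2 k_3$ term reads

$$ \sum_{k_1,k_2=1}^{\infty} k_1 k_2 K_2(k_1,k_2) = N \langle k \rangle \langle k_1 k_2 \rangle_2
   = N \left[ r_1 \langle k^3 \rangle + (1-r_1) \frac{\langle k^2 \rangle^2}{\langle k \rangle} \right] . \tag{A40} $$

Putting all the terms together, we have

$$ K_4^{(\mathrm{sum})} = N \left[ r_1 \langle k^3 \rangle + (1-r_1) \frac{\langle k^2 \rangle^2}{\langle k \rangle} - 2 \langle k^2 \rangle + \langle k \rangle \right] . \tag{A41} $$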



The random-network approximation for this case requires careful treatment because of
the appearance of the degree correlation coefficient $r_1$ in the above summation
formula. Although its value given in (9) is small, it has a critical role in the above
equations. If we use the random-network approximation for $K_2$ given in (A18), we
obtain the $r_1 = 0$ result,

$$ \sum_{k_1,k_2=1}^{\infty} k_1 k_2 K_2^{(\mathrm{ran})}(k_1,k_2) = N \frac{\langle k^2 \rangle^2}{\langle k \rangle} \approx 1.188 \times 10^{12} , \tag{A42} $$

while the exact value is

$$ \sum_{k_1,k_2=1}^{\infty} k_1 k_2 K_2(k_1,k_2) \approx 1.342 \times 10^{11} . \tag{A43} $$

The role of the correlation coefficient $r_1$ is evident in these values: it brings in
partial cancellation between the first term and the second term, so that the actual
value is much smaller than the random-network value (A42). Note that this is deeply
connected with the asymptotic behavior of the degree distribution noted in Section
III B. If all the moments of degree were of order one, the effect of the correlation
coefficient $r_1$ would not be this drastic. However, because the degree distribution is
a power law, the moments $\langle k^2 \rangle$ and $\langle k^3 \rangle$ are
proportional to a positive power of $N$ (we elaborate on the analysis in Appendix B)
and thus are quite large, resulting in the importance of the cancellation by $r_1$
observed above.

For this reason, we evaluate $B_{4a}$ in two schemes in the following. The first scheme
is to use the exact distribution for $K_2(k_2,k_3)$ but use the random-network
approximation for $k_1$ and $k_4$, so that (A43) is satisfied. The second is to use the
random-network approximation for all the nodes in $K_4$. We will carry out the
calculation of both schemes separately.

— Random-network approximation 1 — 

The first approximation scheme is given by

$$ K_4^{(\mathrm{ran1})}(k_1,k_2,k_3,k_4) = \frac{1}{N^2 \langle k \rangle^2}\, K_1(k_1) k_1\, (k_2-1) K_2(k_2,k_3) (k_3-1)\, k_4 K_1(k_4) , \tag{A44} $$

which is obtained by attaching the #1 and #4 nodes to a #2-#3 pair randomly. It is
evident that this satisfies the identity (A37) and therefore (A41). It then follows that

$$ R_{4a}^{(\mathrm{ran1})} = \frac{R_{11}^2}{K_4^{(\mathrm{sum})}} \sum_{k_2,k_3=1}^{\infty} (1-p)^{k_2+k_3} (k_2-1)(k_3-1) K_2(k_2,k_3) \approx 8.33 \times 10^{-3} . \tag{A45} $$

From this we obtain

$$ B_{4a}^{(\mathrm{ran1})} \approx 0.819 . \tag{A46} $$



— Random-network approximation 2 — 

In the second, complete random-network approximation, we have

$$ K_4^{(\mathrm{ran2})}(k_1,k_2,k_3,k_4) = \frac{1}{N^3 \langle k \rangle^3}\, K_1(k_1) k_1\, k_2 K_1(k_2)(k_2-1)\, k_3 K_1(k_3)(k_3-1)\, k_4 K_1(k_4) , \tag{A47} $$

which is obtained by connecting the nodes #2, #3 and #4 in sequence, or alternatively,
by substituting the random-network approximation $K_2^{(\mathrm{ran})}$ in (A18) for
$K_2$ in (A44). We then obtain the following:

$$ R_{4a}^{(\mathrm{ran2})} = R_{11}^2 \left( \frac{R_{12} \langle k^2 \rangle - R_{11} \langle k \rangle}{\langle k^2 \rangle - \langle k \rangle} \right)^2 \approx 9.77 \times 10^{-4} , \tag{A48} $$

which leads to

$$ B_{4a}^{(\mathrm{ran2})} \approx 0.884 , \tag{A49} $$

which is very close to $B_{4a}^{(\mathrm{ran1})}$.

Type (b)

For the other type, (b), we can write

$$ B_{4b} = \frac{1}{3!}\, p^4 \sum_{k_1,k_2,k_3,k_4=1}^{\infty} J_4(k_1,k_2,k_3,k_4)\, (1-p)^{k_1-3} \prod_{i=2}^{4} (1-p)^{k_i-1}
          = \frac{1}{3!} \frac{p^4}{(1-p)^6} R_{4b} J_4^{(\mathrm{sum})} , \tag{A50} $$

where

$$ J_4^{(\mathrm{sum})} \equiv \sum_{k_1,k_2,k_3,k_4=1}^{\infty} J_4(k_1,k_2,k_3,k_4) , \tag{A51} $$

and $R_{4b}$ is a ratio defined by the above. In this case, we denote by
$J_4(k_1,k_2,k_3,k_4)$ the number of clusters of type (b) with the degrees $k_i$ of
nodes #$i$ in Fig. 7. The combinatorial factor $1/3!$ cancels the over-counting of the
same cluster. The following identities hold:

$$ J_4(k_1,k_2,k_3,k_4) = J_4(k_1,k_{\sigma(2)},k_{\sigma(3)},k_{\sigma(4)}) , \tag{A52} $$

$$ \sum_{k_4=1}^{\infty} J_4(k_1,k_2,k_3,k_4) = K_3(k_2,k_1,k_3)(k_1-2) , \tag{A53} $$

where $\sigma(j)$ represents a permutation of $j = 2, 3, 4$. Using the identities, we
have

$$ \sum_{k_3,k_4=1}^{\infty} J_4(k_1,k_2,k_3,k_4) = K_2(k_2,k_1)(k_1-1)(k_1-2) , \tag{A54} $$

which leads to

$$ J_4^{(\mathrm{sum})} = N \left( \langle k^3 \rangle - 3 \langle k^2 \rangle + 2 \langle k \rangle \right) . \tag{A55} $$

— Random-network approximation — 

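As seen in (A55), the degree correlation does not play any major role for this type of
cluster. So, unlike the case of $B_{4a}$, let us employ a simple random-network
approximation of the form:

$$ J_4^{(\mathrm{ran})}(k_1,k_2,k_3,k_4) = \frac{1}{N^3 \langle k \rangle^3}\, K_1(k_1) k_1\, k_2 K_1(k_2)\, (k_1-1)\, k_3 K_1(k_3)\, (k_1-2)\, k_4 K_1(k_4) , \tag{A56} $$

which satisfies identities (A52)-(A55) with $K_2$ replaced by $K_2^{(\mathrm{ran})}$.
We obtain the following:

$$ R_{4b}^{(\mathrm{ran})} = R_{11}^3\, \frac{R_{13} \langle k^3 \rangle - 3 R_{12} \langle k^2 \rangle + 2 R_{11} \langle k \rangle}{\langle k^3 \rangle - 3 \langle k^2 \rangle + 2 \langle k \rangle} \approx 3.52 \times 10^{-4} , \tag{A57} $$

where

$$ R_{13} \equiv \frac{\langle k^3 (1-p)^k \rangle}{\langle k^3 \rangle} \approx 9.57 \times 10^{-4} . \tag{A58} $$

This leads to

$$ B_{4b}^{(\mathrm{ran})} \approx 1.39 . \tag{A59} $$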
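
In the fully random approximation both contributions to $B_4$ depend only on the degree sequence, so they can be evaluated without the joint counts $K_4$ or $J_4$. The sketch below sums (A47) and (A56) against the failure weights; the function name `b4_random` is hypothetical, and the regular-graph example is only a sanity check, not data from the paper.

```python
def b4_random(degrees, p):
    """Random-network estimates of B_4a (4-node chain) and B_4b (3-leaf star)."""
    n = len(degrees)
    mean_k = sum(degrees) / n
    w = [(1 - p) ** k for k in degrees]
    # Node averages <k (1-p)^k>, <k(k-1)(1-p)^k>, <k(k-1)(k-2)(1-p)^k>
    s1 = sum(k * x for k, x in zip(degrees, w)) / n
    s2 = sum(k * (k - 1) * x for k, x in zip(degrees, w)) / n
    s3 = sum(k * (k - 1) * (k - 2) * x for k, x in zip(degrees, w)) / n
    r11 = s1 / mean_k
    prefactor = p ** 4 / (1 - p) ** 6
    # Chain: two end nodes (factor r11 each), two inner nodes (factor s2 each).
    b4a = 0.5 * prefactor * n * r11 ** 2 * s2 ** 2 / mean_k
    # Star: one centre (factor s3) and three leaves (factor r11 each).
    b4b = prefactor / 6.0 * n * r11 ** 3 * s3
    return b4a, b4b


# Sanity check on a 3-regular degree sequence, where paths and stars
# can be counted by hand: N d (d-1)^2 / 2 chains and N C(d,3) stars.
b4a, b4b = b4_random([3] * 100, 0.1)
```

For a $d$-regular sequence both results reduce to the number of such subgraphs times $p^4 (1-p)^{4d-6}$, which is the check used here.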



APPENDIX B: ANALYTIC ESTIMATES OF R'S 
AND THE ASYMPTOTIC BEHAVIOR OF THE 
DEGREE DISTRIBUTION 

Since the probability of failure $p$ is small, one might want to utilize a perturbative
evaluation of the ratios $R$. Indeed, such an analytical expression would be helpful in
understanding what essentially determines the rate of chain bankruptcy. In this
Appendix, we show that the asymptotic behavior of the degree distribution plays the key
role.

Let us denote the probability density function (pdf) of the degree $k$ by $P(k)$ and
its cumulative distribution function (cdf) by

$$ P_>(k) = \sum_{k'=k}^{\infty} P(k') . \tag{B1} $$

We parametrize the cdf as

$$ P_>(k) \simeq \left( \frac{k}{k_0} \right)^{-\mu} \tag{B2} $$

for large $k$. It follows that $P(k) \propto k^{-\mu-1}$ in the same region. For our
data of the production network regarded as an undirected graph, we have

$$ \mu \approx 1.366 , \tag{B3} $$

$$ k_0 \approx 2.18 , \tag{B4} $$

as maximum likelihood estimates (with the standard error of $\mu$ being 0.099).

We define the generating function for the degree distribution by

$$ G(q) \equiv \sum_{k=1}^{\infty} P(k)\, e^{-qk} , \tag{B5} $$

which satisfies $G(0) = 1$. The desired ratios are expressed in terms of the generating
function $G(q)$ as

$$ R_1 = G(q_0) , \qquad R_{1n} = \frac{(-1)^n G^{(n)}(q_0)}{\langle k^n \rangle} , $$

where $G^{(n)}(q)$ is the $n$-th derivative of $G(q)$ with respect to $q$, and
$q_0 = -\log(1-p) \approx 6.22 \times 10^{-3}$.

One might attempt the following analytic expansion of $G(q)$:

$$ G(q) = \sum_{k=1}^{\infty} \left( 1 - qk + \frac{1}{2} q^2 k^2 + \cdots \right) P(k) . \tag{B6} $$

This turns out not to be a useful expansion, as is shown in what follows. Instead, we
shall give an improved expansion.

For the distribution (B2), the second moment of degree is divergent for $\mu < 2$ in a
network of infinite size. It is finite but large for a network of finite size.
Actually, for our data

$$ \langle k^2 \rangle = 3069.6 , \tag{B7} $$

while $\langle k \rangle = 8.006$. So the expansion to the second order is a good
approximation only for

$$ q < \frac{\langle k \rangle}{\langle k^2 \rangle} \approx 2.61 \times 10^{-3} , \tag{B8} $$

but this does not hold in the present case. This is illustrated in Fig. 8, where the
solid curve is the actual $G(q)$, the curve (a) shows the first two terms in the
expansion (B6), and the curve (b) all three terms in the expansion (B6).



FIG. 8: The generating function for the degree distribution, $G(q)$, for $q \approx 0$
(horizontal axis: $q$; vertical axis: $G$). The solid curve is the actual plot. The
first-order and the second-order approximations in Eq. (B6) are shown by the dashed line
(a) and the dash-dot line (b) respectively. The dotted line (c) is the improved
expansion given by (B15). The vertical line corresponds to the actual value of
$q_0 = -\log(1-p)$.



Let us now estimate the order of the coefficients of the
naive expansion (B6) analytically. The $m$-th moment of
degree is dominated by the large-$k$ region for $m > \mu$ as

$$ \langle k^m \rangle \propto \sum^{k_{(\max)}} k^m k^{-\mu-1} \approx \int^{k_{(\max)}} k^{m-\mu-1} \, dk \sim k_{(\max)}^{m-\mu} . \tag{B9} $$



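On the other hand, by taking the node of the largest
degree to sit at the top of the cdf (B2), that is,
$P_>(k_{(\max)}) \sim 1/N$, we have

$$ k_{(\max)} \approx N^{1/\mu} . \tag{B10} $$

Therefore, we obtain

$$ \langle k^m \rangle \propto N^{-1+m/\mu} \tag{B11} $$

for $m > \mu$. It follows from (B11) that the $m$-th term in
(B6) is of order

$$ \frac{1}{N \, m!} \left( N^{1/\mu} q \right)^m . \tag{B12} $$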



Therefore the $m$-th order term is of the same order of
magnitude as the $(m+1)$-th order term provided that

$$ m \sim N^{1/\mu} q \approx 154.7 , \tag{B13} $$

meaning that we need many more than 155 terms for the
expansion (B6) to be useful for evaluating our ratios.
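
The scaling argument can be checked numerically. The sketch below takes the $m$-th term of the naive expansion to scale as $(N^{1/\mu} q)^m / (N\, m!)$, dropping all proportionality constants, and uses $\mu = 1.366$ and $q_0 = 6.20 \times 10^{-3}$. The network size `N = 1.0e6` is an assumed round number ("a million firms"), so the turning point comes out near, not exactly at, 154.7.

```python
import math

mu = 1.366      # tail exponent
q0 = 6.20e-3    # value of q at which the ratios are evaluated
N = 1.0e6       # ASSUMPTION: round number for the network size

k_max = N ** (1.0 / mu)   # largest degree scale, k_max ~ N^(1/mu)
m_star = k_max * q0       # order at which successive terms become comparable


def log_term(m):
    """Logarithm of (1 / (N m!)) (N^(1/mu) q0)^m, the scale of the m-th term."""
    return m * math.log(m_star) - math.lgamma(m + 1) - math.log(N)


# Order at which the term magnitude peaks before the factorial takes over.
m_peak = max(range(1, 1000), key=log_term)

print(f"k_max ~ {k_max:.0f}, m* ~ {m_star:.1f}, largest term at m = {m_peak}")
print(f"log of largest term scale: {log_term(m_peak):.1f}")
```

The terms keep growing up to $m \approx N^{1/\mu} q_0$ and reach enormous magnitudes there, which is why truncating (B6) at low order gives no useful estimate.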

An improved approximation can be obtained as follows.
Let us extract the non-analytic contribution of the
power-law tail by means of an analytic continuation:

$$ \int_{k_0}^{\infty} \frac{\mu k_0^{\mu}}{k^{\mu+1}} e^{-qk} \, dk \supset \mu \Gamma(-\mu) (k_0 q)^{\mu} . \tag{B14} $$
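
One way to see where this term comes from is sketched below for non-integer $\mu$. The upper incomplete gamma function $\Gamma(s, x)$ and its series are standard results, but this notation and this route are not necessarily those of the original derivation.

```latex
\begin{align}
\int_{k_0}^{\infty} \frac{\mu k_0^{\mu}}{k^{\mu+1}} e^{-qk}\, dk
  &= \mu (k_0 q)^{\mu} \int_{k_0 q}^{\infty} t^{-\mu-1} e^{-t}\, dt
   = \mu (k_0 q)^{\mu}\, \Gamma(-\mu, k_0 q) , \qquad t = qk , \\
\Gamma(-\mu, x)
  &= \Gamma(-\mu) - \sum_{n=0}^{\infty} \frac{(-1)^n x^{n-\mu}}{n!\,(n-\mu)} , \\
\mu (k_0 q)^{\mu}\, \Gamma(-\mu, k_0 q)
  &= \mu \Gamma(-\mu) (k_0 q)^{\mu}
     + 1 - \frac{\mu k_0}{\mu - 1}\, q + O(q^2) .
\end{align}
```

The first term is the non-analytic piece, while the remaining terms form an ordinary power series in $q$. For a pure power law the mean degree is $\mu k_0 / (\mu - 1)$, so the analytic part begins as $1 - \langle k \rangle q$.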






TABLE V: The true values and the estimates obtained from
(B15).

Ratio       Exact value           Estimate              Difference
$R_1$       0.9600                0.9611                 0.11%
$R_{11}$    0.7230                0.7017                -2.9%
$R_{12}$    4.501 x 10^{-2}       4.575 x 10^{-2}        1.6%
$R_{13}$    9.574 x 10^{-4}       9.440 x 10^{-4}       -1.3%



For $1 < \mu < 2$, this contribution is of order $q^{\mu}$, a lower
power of $q$ than the second-order, $q^2$, term in (B6), and it
therefore dominates that term at small $q$. We thus arrive at
the following approximation,

$$ G(q) = 1 - \langle k \rangle q + \mu \Gamma(-\mu) (k_0 q)^{\mu} + \cdots . \tag{B15} $$

Alternatively, this expression can be obtained by evaluating
the dominant $k \sim k_{(\max)}$ contribution in
$G(q) - (1 - \langle k \rangle q)$. It should also be noted that this expression
is valid for $q \gg 1/k_{(\max)}$, since we extended the integration
to $k = \infty$ in (B14), instead of cutting it off at
$k = k_{(\max)}$. The curve (c) in Fig. 8 depicts the behavior
of the first three terms on the right-hand side of (B15).
The plot shows that the improved expansion is an excellent
approximation. Indeed, the estimates of the ratios obtained
from (B15) agree with the true values to within 3%, as seen in
Table V.
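
The estimates in Table V can be approximately reproduced from the numbers quoted in this appendix. The sketch below assumes the fitted values $\mu = 1.366$ and $k_0 = 2.18$, the moments $\langle k \rangle = 8.006$ and $\langle k^2 \rangle = 3069.6$, and $q_0 = 6.20 \times 10^{-3}$. It also assumes that $R_1 = G(q_0)$ and that $R_{11}$ and $R_{12}$ are the sign-normalized first and second derivatives divided by the corresponding moments. The function names are illustrative only.

```python
import math

mu, k0 = 1.366, 2.18                 # fitted tail parameters
k_mean, k2_mean = 8.006, 3069.6      # <k> and <k^2> from the data
q0 = 6.20e-3                         # q0 = -log(1 - p)

# Coefficient of the non-analytic term mu * Gamma(-mu) * (k0 q)^mu.
# math.gamma accepts negative non-integer arguments.
c = mu * math.gamma(-mu) * k0 ** mu


def G_improved(q):
    """First three terms of the improved expansion (B15)."""
    return 1.0 - k_mean * q + c * q ** mu


def dG_improved(q):
    """First derivative of the truncated expansion."""
    return -k_mean + c * mu * q ** (mu - 1.0)


def d2G_improved(q):
    """Second derivative of the truncated expansion."""
    return c * mu * (mu - 1.0) * q ** (mu - 2.0)


R1 = G_improved(q0)
R11 = -dG_improved(q0) / k_mean
R12 = d2G_improved(q0) / k2_mean

print(f"R_1 ~ {R1:.4f}, R_11 ~ {R11:.4f}, R_12 ~ {R12:.4e}")
```

The three values land close to the Estimate column (about 0.961, 0.702 and 0.046); small residual differences are expected because the inputs are rounded. $R_{13}$ is omitted because $\langle k^3 \rangle$ is not quoted in this appendix.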